Measuring the quality of multi-document cluster headlines

Authors

  • Frank van Kesteren
  • Wessel Kraaij
Abstract

Headline summaries of multi-document clusters enable efficient navigation and selection of content, provided the headlines are of sufficient quality. This study compares several methods for automated headline extraction that produce headlines of considerably varying length. The reliability of the automated evaluation is validated by comparison with human-produced headlines, taking into account the variability in manually created headlines and inter-human agreement in quality judgements. Results suggest that ROUGE precision is a suitable measure for automatic evaluation of headlines of differing lengths. ROUGE recall can also be used after applying a length penalty.
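To make the distinction between the two measures concrete, the following is a minimal sketch of unigram (ROUGE-1 style) precision and recall for a candidate headline against a single reference. It is not the authors' evaluation setup: the whitespace tokenization, the function names, and in particular the form of the length penalty (scaling recall by the ratio of reference length to candidate length when the candidate is longer) are assumptions chosen for illustration, since the abstract does not specify them.

```python
from collections import Counter


def unigram_overlap(candidate, reference):
    """Return (clipped overlap, candidate length, reference length) in tokens."""
    # Assumption: lowercase whitespace tokenization, no stemming or stopword removal.
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipped matches).
    overlap = sum((cand & ref).values())
    return overlap, sum(cand.values()), sum(ref.values())


def rouge1_precision(candidate, reference):
    """Fraction of candidate unigrams that also occur in the reference."""
    overlap, cand_len, _ = unigram_overlap(candidate, reference)
    return overlap / cand_len if cand_len else 0.0


def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams that are covered by the candidate."""
    overlap, _, ref_len = unigram_overlap(candidate, reference)
    return overlap / ref_len if ref_len else 0.0


def penalized_recall(candidate, reference):
    """Recall scaled by a hypothetical length penalty.

    Assumption: the penalty is min(1, reference length / candidate length),
    so only candidates longer than the reference are penalized.
    """
    overlap, cand_len, ref_len = unigram_overlap(candidate, reference)
    if not cand_len or not ref_len:
        return 0.0
    penalty = min(1.0, ref_len / cand_len)
    return penalty * overlap / ref_len


if __name__ == "__main__":
    reference = "storm hits northern coast"
    short = "storm hits coast"
    long = "a powerful storm hits the northern coast late on monday"
    for headline in (short, long):
        print(headline)
        print("  precision:", rouge1_precision(headline, reference))
        print("  recall:", rouge1_recall(headline, reference))
        print("  penalized recall:", penalized_recall(headline, reference))
```

In this sketch the long headline reaches full recall simply by containing more words, while its precision is low. With the assumed penalty, the penalized recall of an over-long candidate equals its precision; other penalty shapes would behave differently.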

Similar resources

Selecting Labels for News Document Clusters

This work deals with determination of meaningful and terse cluster labels for News document clusters. We analyze a number of alternatives for selecting headlines and/or sentences of document in a document cluster (obtained as a result of an entity-event-duration query), and formalize an approach to extracting a short phrase from well-supported headlines/sentences of the cluster that can serve a...

Spatial Analysis of Physical Quality of Rural Housing in Iran

Housing enjoys a multilateral functioning in the rural system. One of the aspects highlighted by planning system is the renewal and rehabilitation of housing. In our country, Iran, development of rural housing has experienced a growing trend, especially in the physical and structural aspects. However, a large part of the rural population in different areas of the country is living in non-resist...

Improving Energy Consumption by Using Cluster Based Routing Algorithm in Wireless Sensor Networks

Multi-path is a favorite alternative for sensor networks, as it provides an easy mechanism to distribute traffic, as well as considerable fault tolerance. In this paper, a new clustering based multi path routing protocol namely ECRR (Energy efficient Cluster based Routing algorithm for improving Reliability) is proposed, which is a new routing algorithm and guarantees the achievement to required QoS ...

MEASURING SOFTWARE PROCESSES PERFORMANCE BASED ON FUZZY MULTI AGENT MEASUREMENTS

The present article discusses and presents a new and comprehensive approach aimed at measuring the maturity and quality of software processes. This method has been designed on the basis of the Software Capability Maturity Model (SW-CMM) and the Multi-level Fuzzy Inference Model and is used as a measurement and analysis tool. Among the most important characteristics of this method one can mention si...

English and Persian Sport Newspaper Headlines: A comparative study of linguistic means

Using rhetorical figures in specialized languages like the language of newspaper headlines is common. The present study attempted to conduct a contrastive analysis of the English and Persian sport newspaper headlines related to the 2014 FIFA World Cup. Toward this end, a corpus consisting of 400 English and 400 Persian headlines published from 12 June to 13 July 2014 was c...


Publication date: 2006